• Mean = Mode = Median
  ○ symmetrical, bell-shaped
  ○ skewness = excess kurtosis = 0


• described by only two parameters
  ○ (μ, σ²): mean, variance


• continuous probability distribution
  ○ of many random variables
  ○ e.g. body temperature, IQ, heart rate


A normal distribution is symmetric about the mean: values close to the mean have higher probability than values far from the mean.
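The symmetry and the peak at the mean can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` as a stand-in for `scipy.stats.norm` (an assumption made to keep the example dependency-free); the evaluation points are arbitrary.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Points equally far from the mean have the same density (symmetry)
left = std_normal.pdf(-1.0)
right = std_normal.pdf(1.0)

# Density falls as we move away from the mean
densities = [std_normal.pdf(x) for x in (0.0, 1.0, 2.0, 3.0)]

print(left, right)
print(densities)
```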
• Z-scores allow comparison of values with different scales
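A minimal sketch of why standardizing helps: a Z-score expresses a value as the number of standard deviations from its mean, so measurements in different units become comparable. The helper name `z_score` and all numbers below are hypothetical, chosen only for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical values on two unrelated scales
iq_z = z_score(130, mean=100, sd=15)         # IQ points
heart_rate_z = z_score(85, mean=70, sd=10)   # beats per minute

print(iq_z, heart_rate_z)  # 2.0 1.5
```

Under these assumed means and standard deviations, the IQ value is the more unusual of the two, even though the raw numbers cannot be compared directly.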





# find the cumulative probability for a Z-score of 1.65
from scipy.stats import norm
norm.cdf(1.65, loc=0, scale=1)
>>> 0.9505
• CDF of the standard normal distribution (μ = 0; σ = 1)


# find the upper-tail probability (p-value) for a Z-score of 2.1
from scipy import stats
1 - stats.norm.cdf(2.1)

# equivalently: stats.norm.sf(2.1)

>>> 0.0179


Density function 


p-value = 0.01786442 











Confidence Intervals 





95% Interval 


I = μ ± 1.96 * σ


0.025 area 








An illustration of 95% confidence 
Confidence intervals express the precision and uncertainty associated with a particular sampling method.
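The 1.96 multiplier in the 95% interval is the standard normal quantile that leaves an area of 0.025 in each tail. The sketch below recovers it with the standard-library `statistics.NormalDist` (used here in place of `scipy.stats.norm`, as an assumption); the function name `normal_interval` is hypothetical.

```python
from statistics import NormalDist

def normal_interval(mu, sigma, confidence=0.95):
    """Central interval of N(mu, sigma^2) holding `confidence` of the probability."""
    tail = (1 - confidence) / 2                # 0.025 in each tail for 95%
    z_crit = NormalDist().inv_cdf(1 - tail)    # about 1.96 for 95%
    return mu - z_crit * sigma, mu + z_crit * sigma

low, high = normal_interval(mu=0.0, sigma=1.0)
print(round(low, 2), round(high, 2))  # -1.96 1.96
```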
Comparing t- & z-Distributions





Z-score: confidence interval from population mean
t-score: confidence interval from sample mean 


Z distribution (standard normal)
t-distribution (n close to 30)
t-distribution (n smaller than 30)





The t-distribution follows the same shape as the z-distribution, but corrects for small sample sizes. 
For the t-distribution, you need to know your degrees of freedom (n - 1).
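As a sketch of where the degrees of freedom enter, a confidence interval for a mean based on the t-distribution is commonly written as below. The symbols are notation assumed here rather than taken from the slides: \bar{x} is the sample mean, s the sample standard deviation, n the sample size, and \alpha is 1 minus the confidence level.

```latex
\bar{x} \;\pm\; t_{\alpha/2,\; n-1} \cdot \frac{s}{\sqrt{n}}
```

As n grows, the critical value t_{\alpha/2, n-1} approaches the corresponding z critical value (1.96 at the 95% level), which is why the two curves become nearly indistinguishable for large samples.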


Notation  N(μ, σ²)
where ₂F₁ is the hypergeometric function





0 for ν > 1, otherwise undefined


Standard Error of the Mean (SEM)





standard deviation of the sample mean


  ○ measures the variability of the sample mean
  ○ the larger the sample size (n), the smaller the SEM
  ○ smaller SEM: more precise estimate of the population mean
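The effect of sample size on the SEM can be sketched with the usual estimate, the sample standard deviation divided by the square root of n. The function name `sem` and the data are hypothetical; the larger sample simply repeats the smaller one so that the spread stays similar while n grows.

```python
import math
import statistics

def sem(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical measurements
small = [68, 72, 70, 74]   # n = 4
large = small * 25         # n = 100, same values repeated

print(sem(small), sem(large))
```

With a similar spread, the larger sample gives a noticeably smaller SEM.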
Appendix





Example: Critical value 
You survey 100 Brits and 100 Americans about their television-watching habits, and find that both 
groups watch an average of 35 hours of television per week. 


The SDs for the two distributions are:
  ○ 10 for the GB estimate
  ○ 5 for the USA estimate


Average hours of TV watched per week, G.B. (orange) vs USA (blue)
The estimate for Great Britain (95% CI = 33.04, 36.96) has a wider confidence interval than the estimate for the US (95% CI = 34.02, 35.98).
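The two intervals in this example can be reproduced from the stated mean (35 hours), sample size (100) and standard deviations (10 and 5). This is a minimal sketch that assumes the normal critical value 1.96; the function name `mean_ci_95` is hypothetical.

```python
import math

def mean_ci_95(mean, sd, n, z_crit=1.96):
    """95% confidence interval for a mean, using the normal critical value 1.96."""
    margin = z_crit * sd / math.sqrt(n)
    return round(mean - margin, 2), round(mean + margin, 2)

gb_ci = mean_ci_95(mean=35, sd=10, n=100)
usa_ci = mean_ci_95(mean=35, sd=5, n=100)

print(gb_ci)   # (33.04, 36.96)
print(usa_ci)  # (34.02, 35.98)
```

Both groups share the same point estimate; only the larger standard deviation makes the British interval wider.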
Average hours of TV per week, G.B. vs USA, with 95% confidence interval
scipy.stats methods/functions


norm.pdf(): probability density function 


norm.cdf(): cumulative distribution function 


norm.ppf(): percent point function (inverse of cdf) 


norm.sf(): survival function (1-cdf) 


norm.isf(): inverse survival function (inverse of sf) 
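How these five functions relate to one another can be sketched without scipy. The block below uses the standard-library `statistics.NormalDist` as a stand-in (an assumption, to keep the example dependency-free); the comments name the `scipy.stats.norm` call each line corresponds to.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, standing in for scipy.stats.norm

x = 1.65
density = z.pdf(x)                     # analogue of norm.pdf(x)
cumulative = z.cdf(x)                  # analogue of norm.cdf(x)
survival = 1 - cumulative              # analogue of norm.sf(x)
x_from_cdf = z.inv_cdf(cumulative)     # analogue of norm.ppf(cumulative)
x_from_sf = z.inv_cdf(1 - survival)    # analogue of norm.isf(survival)

print(round(cumulative, 4), round(survival, 4))    # 0.9505 0.0495
print(round(x_from_cdf, 2), round(x_from_sf, 2))   # 1.65 1.65
```

The inverse functions return the original point, which is the sense in which ppf undoes cdf and isf undoes sf.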


Normal Distribution - PDF 